Task-Agnostic Graph Neural Network Evaluation via Adversarial Collaboration
There is an increasing demand for reliable methods to evaluate
the progress of Graph Neural Network (GNN) research for molecular
representation learning. Existing GNN benchmarking methods for molecular
representation learning focus on comparing the GNNs' performances on some
node/graph classification/regression tasks on certain datasets. However, a
principled, task-agnostic method to directly compare two GNNs is still lacking.
Additionally, most existing self-supervised learning works rely on
handcrafted data augmentations, which are difficult to apply to graphs
due to their unique characteristics. To address the
aforementioned issues, we propose GraphAC (Graph Adversarial Collaboration) --
a conceptually novel, principled, task-agnostic, and stable framework for
evaluating GNNs through contrastive self-supervision. We introduce a novel
objective function, the Competitive Barlow Twins, which allows two GNNs to
jointly update themselves through direct competition against each other. GraphAC
succeeds in distinguishing GNNs of different expressiveness across various
aspects, and has been demonstrated to be a principled and reliable GNN evaluation
method, without necessitating any augmentations.
Comment: 11th International Conference on Learning Representations (ICLR 2023)
Machine Learning for Drug Discovery (MLDD) Workshop. 17 pages, 6 figures, 4
tables
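For context on the objective named above, the standard (non-competitive) Barlow Twins loss can be written as follows. This is the original Barlow Twins formulation, not the Competitive Barlow Twins variant that GraphAC introduces, whose exact form is not given in this abstract. The symbols $z^A$, $z^B$ (batch-centered embeddings of two views), $b$ (batch index), and $\lambda$ (trade-off weight) are notation assumed here for illustration.

```latex
\mathcal{L}_{\mathrm{BT}}
  = \sum_{i} \left(1 - C_{ii}\right)^{2}
  + \lambda \sum_{i} \sum_{j \neq i} C_{ij}^{2},
\qquad
C_{ij}
  = \frac{\sum_{b} z^{A}_{b,i}\, z^{B}_{b,j}}
         {\sqrt{\sum_{b} \left(z^{A}_{b,i}\right)^{2}}\,
          \sqrt{\sum_{b} \left(z^{B}_{b,j}\right)^{2}}}
```

The first term pulls the diagonal of the cross-correlation matrix $C$ toward 1 (the two embeddings agree), and the second pushes off-diagonal entries toward 0 (embedding dimensions are decorrelated).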
DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
Predicting the binding structure of a small molecule ligand to a protein -- a
task known as molecular docking -- is critical to drug design. Recent deep
learning methods that treat docking as a regression problem have decreased
runtime compared to traditional search-based methods but have yet to offer
substantial improvements in accuracy. We instead frame molecular docking as a
generative modeling problem and develop DiffDock, a diffusion generative model
over the non-Euclidean manifold of ligand poses. To do so, we map this manifold
to the product space of the degrees of freedom (translational, rotational, and
torsional) involved in docking and develop an efficient diffusion process on
this space. Empirically, DiffDock obtains a 38% top-1 success rate (RMSD < 2 Å) on
PDBBind, significantly outperforming the previous state-of-the-art of
traditional docking (23%) and deep learning (20%) methods. Moreover, DiffDock
has fast inference times and provides confidence estimates with high selective
accuracy.
Comment: Under review
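The top-1 success rate quoted above (fraction of complexes whose top-ranked pose has RMSD below 2 Å) can be illustrated with a minimal sketch. The function names and the toy coordinates are hypothetical and are not taken from DiffDock or PDBBind; the sketch also assumes atoms are already matched one-to-one and ignores the symmetry correction that docking benchmarks often apply.

```python
import math

def rmsd(pred, ref):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) atoms."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    # Sum of squared coordinate differences over all atoms.
    sq = sum((p - r) ** 2 for a, b in zip(pred, ref) for p, r in zip(a, b))
    return math.sqrt(sq / len(pred))

def top1_success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose top-ranked pose has RMSD below the threshold (in Å)."""
    return sum(r < threshold for r in rmsds) / len(rmsds)

# Toy data (hypothetical): every atom is displaced by exactly 1 Å along y.
ref = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pred = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0)]
print(rmsd(pred, ref))                          # 1.0
print(top1_success_rate([0.8, 1.9, 2.0, 5.3]))  # 0.5 (two of four are below 2 Å)
```

The comparison is strict (`<`), matching the "RMSD < 2 Å" criterion, so a pose at exactly 2.0 Å does not count as a success.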
DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models
Understanding how proteins structurally interact is crucial to modern
biology, with applications in drug discovery and protein design. Recent machine
learning methods have formulated protein-small molecule docking as a generative
problem with significant performance boosts over both traditional and deep
learning baselines. In this work, we propose a similar approach for rigid
protein-protein docking: DiffDock-PP is a diffusion generative model that
learns to translate and rotate unbound protein structures into their bound
conformations. We achieve state-of-the-art performance on DIPS with a median
C-RMSD of 4.85, outperforming all considered baselines. Additionally,
DiffDock-PP is faster than all search-based methods and generates reliable
confidence estimates for its predictions. Our code is publicly available at
Comment: ICLR Machine Learning for Drug Discovery (MLDD) Workshop 202
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems
Advances in artificial intelligence (AI) are fueling a new paradigm of
discoveries in natural sciences. Today, AI has started to advance natural
sciences by improving, accelerating, and enabling our understanding of natural
phenomena at a wide range of spatial and temporal scales, giving rise to a new
area of research known as AI for science (AI4Science). Being an emerging
research paradigm, AI4Science is unique in that it is an enormous and highly
interdisciplinary area. Thus, a unified and technical treatment of this field
is needed yet challenging. This work aims to provide a technically thorough
account of a subarea of AI4Science; namely, AI for quantum, atomistic, and
continuum systems. These areas aim at understanding the physical world from the
subatomic (wavefunctions and electron density), atomic (molecules, proteins,
materials, and interactions), to macro (fluids, climate, and subsurface) scales
and form an important subarea of AI4Science. A unique advantage of focusing on
these areas is that they largely share a common set of challenges, thereby
allowing a unified and foundational treatment. A key common challenge is how to
capture physics first principles, especially symmetries, in natural systems by
deep learning methods. We provide an in-depth yet intuitive account of
techniques to achieve equivariance to symmetry transformations. We also discuss
other common technical challenges, including explainability,
out-of-distribution generalization, knowledge transfer with foundation and
large language models, and uncertainty quantification. To facilitate learning
and education, we provide categorized lists of resources that we found to be
useful. We strive to be thorough and unified and hope this initial effort may
spark more community interest and effort to further advance AI4Science.
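The notion of equivariance to symmetry transformations mentioned above can be checked numerically on a toy example. A function f is equivariant to a rotation R if f(Rx) = R f(x), and invariant if f(Rx) = f(x). The sketch below is illustrative only: the helper names and the three "atoms" are hypothetical and do not come from the survey.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def centroid(points):
    """Mean position of a point cloud (a rotation-equivariant quantity)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def radius_of_gyration(points):
    """RMS distance of the points to their centroid (a rotation-invariant quantity)."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))

atoms = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]  # hypothetical coordinates
angle = 0.7
rotated = [rotate_z(p, angle) for p in atoms]

# Equivariance: rotating the input rotates the output in the same way.
lhs = centroid(rotated)
rhs = rotate_z(centroid(atoms), angle)
equivariant = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(lhs, rhs))

# Invariance: rotating the input leaves the output unchanged.
invariant = math.isclose(radius_of_gyration(rotated), radius_of_gyration(atoms), rel_tol=1e-12)

print(equivariant, invariant)
```

Equivariant networks are designed so that properties like these hold by construction for every layer, rather than being learned approximately from augmented data.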